
Rationale and Research Questions

Background

In the context of escalating global environmental challenges, the shift from traditional fossil fuel-based energy sources to renewable energy has become a focal point in efforts to reduce carbon emissions and combat climate change (Kabeyi and Olanrewaju, 2022). However, the transition’s broader environmental impacts, particularly on air quality, remain less explored. These impacts are especially relevant given rising global energy demand and the need to meet that demand sustainably.

Significance

Understanding the relationship between renewable energy generation and air quality is crucial. Renewable energy sources like wind and solar are lauded for their lower environmental impact compared to fossil fuels, which are major contributors to air pollution (UCSUSA, 2018). Air pollution is a significant environmental hazard, affecting human health, ecosystems, and the climate. It is responsible for millions of premature deaths annually and contributes to the occurrence of diseases like asthma, heart disease, and lung cancer (National Geographic, 2023). Therefore, assessing how increased renewable energy generation affects air quality indicators is not just an environmental concern but a public health imperative.

Theoretical Context

The hypothesis underlying this research is that an increase in renewable energy generation leads to a reduction in air pollution. This hypothesis is grounded in the understanding that renewable energy sources, unlike fossil fuels, do not emit pollutants like sulfur dioxide, nitrogen oxides, and particulate matter during electricity generation.

Research Questions

1: What is the relationship between distributed renewable energy generation and the level of air pollution?

This question aims to investigate the correlation between the rise in renewable energy generation and the concentrations of various air pollutants. It seeks to understand whether regions with higher renewable energy output exhibit lower levels of air pollutants.

2: Among air quality indicators (PM10, PM2.5, CO, NO2, and SO2), which display the most significant response to variations in energy generation?

This question delves deeper into identifying which specific pollutants are most responsive to changes in energy generation types. It is crucial for pinpointing the environmental benefits of renewable energy sources and for policy-making aimed at targeted air pollution reduction.

Dataset Information

The exploratory analysis required combining air quality and power plant datasets. Air quality data were obtained from the United States Environmental Protection Agency (EPA), while power plant data were obtained from the U.S. Energy Information Administration (EIA). The sample period spans the two decades from 2001 to 2021.

Dataset Information for the Sample Period 2001 - 2021

| Dataset | Source | Variables |
|---|---|---|
| Air Quality Summary Statistics by Criteria Pollutants and Location | EPA Air Quality System (AQS) | Daily PM2.5, PM10, SO2, NO2, and CO Concentrations |
| Power Plant Generator Level Capacities and Locations | EIA Form EIA-860 | Annual Installed Generation Capacity by Fuel Type |
| Power Plant Monthly Energy Generation | EIA Form EIA-923 | Monthly Net Generation by Fuel Type |
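
Combining the datasets listed above means bringing daily air quality readings and monthly generation figures onto a common time step. The following Python sketch shows one way this could be done: average the daily values per state and month, then join on state, year, and month. The field names (`state`, `date`, `value`), both helper functions, and the sample records are hypothetical and are not taken from the EPA or EIA files or from the processing actually used for this report.

```python
from collections import defaultdict
from statistics import mean

def monthly_mean_concentration(daily_rows):
    """Average daily readings into one value per (state, year, month)."""
    buckets = defaultdict(list)
    for row in daily_rows:
        year, month, _day = row["date"].split("-")
        buckets[(row["state"], int(year), int(month))].append(row["value"])
    return {key: mean(values) for key, values in buckets.items()}

def join_on_state_month(monthly_air, monthly_generation):
    """Inner-join the two monthly tables on (state, year, month)."""
    joined = []
    for key in sorted(monthly_air.keys() & monthly_generation.keys()):
        state, year, month = key
        joined.append({
            "state": state,
            "year": year,
            "month": month,
            "mean_concentration": monthly_air[key],
            "net_generation": monthly_generation[key],
        })
    return joined

# Made-up illustrative records, not values from the EPA or EIA files.
daily_pm25 = [
    {"state": "CA", "date": "2021-01-01", "value": 10.0},
    {"state": "CA", "date": "2021-01-02", "value": 14.0},
    {"state": "CA", "date": "2021-02-01", "value": 8.0},
]
generation = {("CA", 2021, 1): 3.5, ("CA", 2021, 3): 4.1}

merged = join_on_state_month(monthly_mean_concentration(daily_pm25), generation)
print(merged)
```

An inner join keeps only the state-months present in both sources, so months with missing monitor data or missing generation reports drop out rather than appearing as gaps.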

Exploratory Analysis

We began our exploratory analysis by examining whether and how solar and wind generating capacity, net generation, and air quality have changed over time in the contiguous United States. We first used the wrangled power plant data to visualize the changes in the total installed capacity of solar and wind plants from 2001 to 2021. Figure 1 shows that California, Texas, and Iowa hold the highest cumulative installed capacity, indicating that these states added more capacity over the period than other states. In addition, we plotted the annual energy generation from solar and wind sources over the same period and observed that the states with the highest installed capacity also exhibit the largest annual energy generation from these sources (Figure 2 and Figure 3).

Figure 1. Annual Installed Generation Capacity: Solar and Wind (MW)

Figure 2. Annual Solar and Wind Energy Generation over time (TWh)

Figure 3. Monthly Renewable Energy Generation per State (TWh)

The change in installed solar and wind plants can also be visualized spatially across the contiguous United States, as illustrated below. The map shows that the number of installed solar and wind plants increased significantly over the two decades from 2001 to 2021.

We identified the three states with the highest growth in installed capacity and used ggplot and gganimate to visualize the increase and distribution of plants within them. The resulting visualization is shown below:

Solar and Wind Power Plant Locations

After analyzing the increasing installed capacity of solar and wind plants in the United States, with a special focus on California, Texas, and Iowa, we examined air quality data to determine how the concentrations of key criteria pollutants changed over time. We plotted the wrangled air quality data over time for the pollutants most closely linked to fossil fuel generation: SO2, NO2, and PM2.5. The results presented below show that the measured concentrations of these pollutants have decreased over time.


PM2.5 Concentration

SO2 Concentration

NO2 Concentration

To investigate the trend in the measured pollutants over time, we conducted a time series analysis of the measured values of each of the three pollutants in California, Texas, and Iowa from 2001 to 2021. Our goal was to determine whether there has been a change in the recorded PM2.5, SO2, and NO2 concentrations over the sample period. The null hypothesis is that there has been no change in the recorded PM2.5, SO2 and NO2 concentrations in the three states over the sample period. The alternative hypothesis is that there has been a change in the recorded PM2.5, SO2 and NO2 concentrations in the three states over the sample period.

After decomposing the time series, we observed that each of the three pollutants has a seasonal component, as seen in the time series plots shown below. We therefore ran the seasonal Mann-Kendall (SMK) test on each dataset, which produced a p-value below 0.05 for each time series. As a result, we can reject the null hypothesis, and the analysis indicates that there has been a change in the recorded PM2.5, SO2, and NO2 concentrations over the period 2001 - 2021 in California, Texas, and Iowa. The negative tau values indicate a monotonic downward trend, which implies that the change for each pollutant in each state has been a decrease.

Results of Seasonal Mann-Kendall test

| Trend | tau | 2-sided p-value |
|---|---|---|
| CA_PM2.5 | -0.4198413 | 0 |
| CA_SO2 | -0.7531746 | 0 |
| CA_NOX | -0.7285714 | 0 |
| TX_PM2.5 | -0.3888889 | 0 |
| TX_SO2 | -0.4880952 | 0 |
| TX_NOX | -0.7150794 | 0 |
| IA_PM2.5 | -0.4238095 | 0 |
| IA_SO2 | -0.5976190 | 0 |
| IA_NOX | -0.6715447 | 0 |
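
To make the tau and p-value columns above easier to read, the following Python sketch shows how a seasonal Mann-Kendall statistic can be computed: the Mann-Kendall S statistic is calculated separately within each season (here, each calendar month) and the per-season results are summed. This is a simplified illustration that assumes no tied values and no correction for serial correlation; it is not the routine used to produce the table, and the function name and synthetic series are hypothetical.

```python
import math
from collections import defaultdict

def seasonal_mann_kendall(values, seasons):
    """Return (tau, two-sided p-value) for a seasonal Mann-Kendall test.

    values  -- observations in time order
    seasons -- season label (e.g. month 1-12) for each observation
    Assumes no tied values and no serial-correlation correction.
    """
    by_season = defaultdict(list)
    for value, season in zip(values, seasons):
        by_season[season].append(value)

    s_total = 0       # sum of the per-season Mann-Kendall S statistics
    var_total = 0.0   # sum of the per-season variances of S
    pairs_total = 0   # number of within-season pairs, used to scale tau
    for series in by_season.values():
        n = len(series)
        for i in range(n - 1):
            for j in range(i + 1, n):
                diff = series[j] - series[i]
                s_total += (diff > 0) - (diff < 0)  # sign of the change
        var_total += n * (n - 1) * (2 * n + 5) / 18
        pairs_total += n * (n - 1) // 2

    tau = s_total / pairs_total
    # Continuity-corrected normal approximation for S.
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return tau, p_value

# Synthetic series: five years of monthly values that fall by 2 units per
# year on top of a repeating seasonal pattern (not real monitoring data).
values, seasons = [], []
for year in range(5):
    for month in range(1, 13):
        values.append(20.0 - 2.0 * year + (month % 6))
        seasons.append(month)

tau, p_value = seasonal_mann_kendall(values, seasons)
print(tau, p_value)
```

Because comparisons are made only between observations from the same month, the seasonal cycle does not mask or mimic a long-term trend. In the synthetic series every within-month comparison is a decrease, so tau takes its minimum value of -1.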

Analysis

Question 1:

What is the relationship between distributed renewable energy generation and the level of air pollution?

We can formulate a null and alternative hypothesis for the above research question as follows:

H0: There is no change in recorded air quality with an increase in renewable energy generation in the states of California, Texas and Iowa over the period 2001 - 2021.
Ha: There is a change in recorded air quality with an increase in renewable energy generation in the states of California, Texas and Iowa over the period 2001 - 2021.

To evaluate this hypothesis, we plotted the mean measured PM2.5 concentration against net monthly solar and wind generation for all three states.

The figures above suggest that the measured pollutant concentrations have an inverse relationship, or negative correlation, with net generation from solar and wind. That is, months with higher wind and solar generation tend to show lower concentrations of the three criteria pollutants. To investigate this further, we performed a simple linear regression of the mean concentration of each pollutant on net energy generation, with the results summarized in the table below:

Simple Linear Regression Results

Coefficient cells give Beta (95% CI), p-value.

| State | Characteristic | PM2.5 | SO2 | NO2 |
|---|---|---|---|---|
| California | Net Generation | -0.77 (-1.1, -0.46), p <0.001 | -0.21 (-0.24, -0.18), p <0.001 | -1.8 (-2.1, -1.6), p <0.001 |
| California | R² | 0.088 | 0.445 | 0.464 |
| California | AIC | 1,375 | 189 | 1,255 |
| California | σ | 3.68 | 0.349 | 2.89 |
| Texas | Net Generation | -0.28 (-0.35, -0.20), p <0.001 | -0.09 (-0.12, -0.07), p <0.001 | -0.47 (-0.57, -0.37), p <0.001 |
| Texas | R² | 0.177 | 0.213 | 0.255 |
| Texas | AIC | 1,000 | 401 | 1,148 |
| Texas | σ | 1.75 | 0.532 | 2.34 |
| Iowa | Net Generation | -1.3 (-1.6, -0.98), p <0.001 | -0.64 (-0.74, -0.55), p <0.001 | -1.5 (-1.8, -1.2), p <0.001 |
| Iowa | R² | 0.214 | 0.423 | 0.352 |
| Iowa | AIC | 1,161 | 563 | 1,053 |
| Iowa | σ | 2.41 | 0.733 | 1.99 |

CI = Confidence Interval
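
As a reading aid for the Beta, R² and σ entries reported above, the following Python sketch shows how these quantities arise from an ordinary least squares fit with a single predictor. The function name and the five data points are made up for illustration; they are not the monthly series analyzed in this report.

```python
import math

def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                       # slope of the regression line
    intercept = mean_y - beta * mean_x
    residuals = [yi - (intercept + beta * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot        # share of variance explained
    sigma = math.sqrt(ss_res / (n - 2))    # residual standard error
    return {"beta": beta, "intercept": intercept,
            "r_squared": r_squared, "sigma": sigma}

# Illustrative values only: hypothetical net generation and mean concentration.
net_generation = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_concentration = [9.0, 8.2, 6.9, 6.1, 4.8]

fit = simple_linear_regression(net_generation, mean_concentration)
print(fit)
```

A negative `beta` means the fitted concentration falls as generation rises, while `r_squared` indicates how much of the variation in concentration that single predictor accounts for.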

Interpretation

PM2.5 and Net Generation:

The beta coefficient of net renewable energy generation, which is the slope of the regression line relating PM2.5 concentration to renewable energy generation, is -0.77, -0.28 and -1.3 for California, Texas and Iowa respectively. This implies that an increase in solar and wind energy generation is associated with a decrease in PM2.5 concentrations. At a significance level of 0.05, the p-values fall below the threshold, so the results are statistically significant and indicate a negative correlation between PM2.5 concentration and renewable energy generation. Based on the R-squared values, 8.8%, 17.7% and 21.4% of the total variance in PM2.5 concentration in California, Texas and Iowa respectively can be explained by renewable energy generation.

Sulfur Dioxide (SO2) and Net Generation:

The beta coefficient of net renewable energy generation, which is the slope of the regression line relating SO2 concentration to renewable energy generation, is -0.21, -0.09 and -0.64 for California, Texas and Iowa respectively. This implies that an increase in solar and wind energy generation is associated with a decrease in SO2 concentrations. At a significance level of 0.05, the p-values fall below the threshold, so the results are statistically significant and indicate a negative correlation between SO2 concentration and renewable energy generation. Based on the R-squared values, 44.5%, 21.3% and 42.3% of the total variance in SO2 concentration in California, Texas and Iowa respectively can be explained by renewable energy generation.

Nitrogen Dioxide (NO2) and Net Generation:

The beta coefficient of net renewable energy generation, which is the slope of the regression line relating NO2 concentration to renewable energy generation, is -1.8, -0.47 and -1.5 for California, Texas and Iowa respectively. This implies that an increase in solar and wind energy generation is associated with a decrease in NO2 concentrations. At a significance level of 0.05, the p-values fall below the threshold, so the results are statistically significant and indicate a negative correlation between NO2 concentration and renewable energy generation. Based on the R-squared values, 46.4%, 25.5% and 35.2% of the total variance in NO2 concentration in California, Texas and Iowa respectively can be explained by renewable energy generation.

In conclusion, these observations suggest an inverse relationship in which increased renewable energy generation is associated with decreased concentrations of PM2.5, SO2 and NO2, pollutants typically associated with the combustion of fossil fuels.

Multiple Linear Regression Results

Coefficient cells give Beta (95% CI), p-value.

| State | Characteristic | PM2.5 | SO2 | NO2 |
|---|---|---|---|---|
| California | Net Generation | -0.29 (-0.82, 0.25), p = 0.3 | 0.07 (0.04, 0.10), p <0.001 | -1.7 (-2.1, -1.2), p <0.001 |
| California | Month | 0.27 (0.15, 0.40), p <0.001 | 0.01 (0.00, 0.01), p = 0.11 | 0.18 (0.08, 0.28), p <0.001 |
| California | Year | -0.14 (-0.27, -0.01), p = 0.033 | -0.08 (-0.09, -0.08), p <0.001 | -0.04 (-0.15, 0.06), p = 0.4 |
| California | R² | 0.163 | 0.818 | 0.490 |
| California | AIC | 1,357 | -87.6 | 1,246 |
| California | σ | 3.53 | 0.201 | 2.83 |
| Texas | Net Generation | -0.10 (-0.30, 0.11), p = 0.4 | 0.21 (0.17, 0.26), p <0.001 | 0.40 (0.15, 0.64), p = 0.002 |
| Texas | Month | -0.01 (-0.07, 0.05), p = 0.8 | -0.01 (-0.03, 0.00), p = 0.10 | 0.04 (-0.04, 0.11), p = 0.4 |
| Texas | Year | -0.09 (-0.19, 0.00), p = 0.060 | -0.16 (-0.18, -0.14), p <0.001 | -0.45 (-0.57, -0.33), p <0.001 |
| Texas | R² | 0.189 | 0.567 | 0.389 |
| Texas | AIC | 1,000 | 254 | 1,102 |
| Texas | σ | 1.74 | 0.396 | 2.13 |
| Iowa | Net Generation | -0.72 (-1.4, -0.03), p = 0.041 | 0.13 (-0.06, 0.31), p = 0.2 | 0.14 (-0.40, 0.67), p = 0.6 |
| Iowa | Month | -0.14 (-0.23, -0.06), p = 0.001 | -0.02 (-0.04, 0.01), p = 0.2 | -0.09 (-0.16, -0.03), p = 0.007 |
| Iowa | Year | -0.10 (-0.21, 0.01), p = 0.080 | -0.14 (-0.17, -0.11), p <0.001 | -0.30 (-0.38, -0.21), p <0.001 |
| Iowa | R² | 0.254 | 0.566 | 0.462 |
| Iowa | AIC | 1,152 | 495 | 1,011 |
| Iowa | σ | 2.35 | 0.638 | 1.82 |

CI = Confidence Interval
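
The multiple regression above adds month and year as predictors alongside net generation. The following Python sketch illustrates, on synthetic data, why that matters: when generation rises over the years while a pollutant falls over the same years, a regression on generation alone attributes the time trend to generation, whereas including year separates the two. The function `fit_ols`, the variable names, and the data-generating rule are all hypothetical; the sketch solves the normal equations directly and is not the procedure used to produce the table.

```python
def fit_ols(rows, y):
    """Least-squares coefficients for y ~ intercept + columns of rows.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]

# Synthetic monthly data, constructed so that the pollutant depends on month
# and year only, while generation simply grows over the years.
rows, pollutant = [], []
for year in range(5):  # year index 0-4 rather than calendar years
    for month in range(1, 13):
        generation = 1.0 + 0.5 * year + 0.1 * ((month * 7) % 5)
        rows.append((generation, month, year))
        pollutant.append(30.0 - 1.0 * year + 0.5 * month)

intercept, b_generation, b_month, b_year = fit_ols(rows, pollutant)

# Slope from regressing the pollutant on generation alone, for comparison.
gen = [r[0] for r in rows]
gen_mean = sum(gen) / len(gen)
pol_mean = sum(pollutant) / len(pollutant)
naive_slope = (
    sum((g - gen_mean) * (p - pol_mean) for g, p in zip(gen, pollutant))
    / sum((g - gen_mean) ** 2 for g in gen)
)

print(f"generation alone: slope = {naive_slope:.2f}")
print(f"with month and year: generation = {b_generation:.2f}, "
      f"month = {b_month:.2f}, year = {b_year:.2f}")
```

In this constructed example the generation-only slope is negative even though generation has no effect on the pollutant by design, and the coefficient moves to approximately zero once year is included. Whether the same mechanism explains the coefficient changes between the two tables in this report cannot be determined from the sketch.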

Question 2:

Among three air quality indicators (PM2.5, NO2, and SO2), which display the most significant response to variations in energy generation?

We can formulate a null and alternative hypothesis for the above research question as follows:

H0: An increase in renewable energy generation has had a uniform impact on all three air quality indicators in the states of California, Texas and Iowa over the period 2001 - 2021.
Ha: An increase in renewable energy generation has not had a uniform impact on all three air quality indicators in the states of California, Texas and Iowa over the period 2001 - 2021.

To evaluate the above research question, we plotted the distribution of pollutant concentrations over time, as shown in the box plots below.

Interpretation of the Boxplots

From the boxplots, we noted the following:

PM2.5 Levels: The distribution of PM2.5 levels varies widely between states. California shows a particularly high range of PM2.5 concentrations with notable outliers, indicating episodes of very poor air quality. It’s important to look into the reasons for California’s variability, such as wildfires or urban pollution.

NO2 Levels: NO2 levels are somewhat variable, with California showing a higher median concentration. This could be associated with industrial activities or high traffic density.

SO2 Levels: Iowa has historically had a higher range of SO2 concentrations, which have fluctuated over time and trended downwards. This pollutant is often associated with industrial processes and the burning of sulfur-containing fuels such as coal. Coal-fired energy generation in Iowa could explain the relatively higher SO2 concentrations measured there compared with California and Texas.
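
The medians, ranges and outliers discussed above are the quantities a box plot conventionally displays. The following Python sketch shows how they can be computed with Tukey's 1.5 × IQR rule; the function name and the sample concentrations are hypothetical, and the quartile method chosen here is an assumption that may differ slightly from the one used by the plotting software behind the box plots.

```python
from statistics import quantiles

def boxplot_summary(values):
    """Quartiles, whisker fences and outliers under the 1.5 * IQR rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3,
            "lower_fence": lower_fence, "upper_fence": upper_fence,
            "outliers": outliers}

# Made-up monthly PM2.5 values with one extreme month.
pm25 = [6.0, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 35.0]
summary = boxplot_summary(pm25)
print(summary)
```

A single extreme month, such as one affected by wildfire smoke, shows up as an outlier without shifting the median much, which is why the box plots can show stable medians alongside episodes of very poor air quality.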

Summary and Conclusions

As shown in the analysis, renewable energy generation in the United States has trended upward over time, indicating increased adoption and capacity. California (CA) stands out with significantly higher generation, including a steep increase around 2020. Other states also show growth in renewable energy generation, but to varying degrees; Texas (TX) and Iowa (IA), for instance, show notable increases. The variability in generation over time could be influenced by factors like state policies, technological advancements, and investment in renewable energy infrastructure. The overall increasing trend aligns with global efforts to transition to cleaner energy sources to reduce reliance on fossil fuels and combat climate change.

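Our simple linear regressions suggest that the measured concentrations of the three pollutants have an inverse relationship, or negative correlation, with net generation from solar and wind in the three states: months with higher wind and solar generation tend to have lower concentrations of the three criteria pollutants. In the multiple linear regressions that also include month and year, however, the net generation coefficient is not consistently negative or statistically significant, so this association should not be read as evidence of a causal effect.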

References

Kabeyi, Moses Jeremiah Barasa, and Oludolapo Akanni Olanrewaju. ‘Sustainable Energy Transition for Renewable and Low Carbon Grid Electricity Generation and Supply’. Frontiers in Energy Research, vol. 9, 2022. Frontiers, https://www.frontiersin.org/articles/10.3389/fenrg.2021.743114.

Union of Concerned Scientists. ‘Environmental Impacts of Renewable Energy Technologies’. https://www.ucsusa.org/resources/environmental-impacts-renewable-energy-technologies. Accessed 6 Dec. 2023.

National Geographic. ‘Air Pollution’. https://education.nationalgeographic.org/resource/air-pollution. Accessed 6 Dec. 2023.